This article presents an efficient design and implementation of a discrete multiwavelet
critical-sampling transform based orthogonal frequency division multiplexing (DMWCST-OFDM)
transceiver on a field programmable gate array (FPGA) platform. The design uses the 16-point
discrete multiwavelet critical-sampling transform (DMWCST) and its inverse as the main processing
modules. All modules were designed with Xilinx system generator (XSG), which is part of Vivado®
Design Suite version 2015.2 and is compatible with MATLAB Simulink version R2013b. The FPGA
implementation is carried out on a Zynq (XC7Z020-1CLG484) evaluation board with joint test action
group (JTAG) hardware co-simulation. According to the results obtained from the implementation
tools, the implemented system is efficient in terms of resource utilization and can support
real-time operation.

1. INTRODUCTION

Orthogonal frequency division multiplexing (OFDM) divides the available spectrum into several
parallel subcarriers. Each subcarrier is modulated by a low-rate data stream at a different
orthogonal carrier frequency [1]-[3]. In traditional OFDM systems, the fast Fourier transform (FFT)
is used to fulfil the requirement of orthogonality between the subcarriers [4]-[8]. The FFT has a
major drawback arising from its rectangular window, which creates high side lobes in the transmitted
signal, thereby increasing the sensitivity of the OFDM system to inter-carrier interference (ICI)
and lowering system performance [9]-[14]. The discrete multiwavelet critical-sampling transform
based OFDM system (DMWCST-OFDM) was proposed to mitigate the impairments of the FFT based OFDM
system [15]-[20]. In the proposed design, an inverse discrete multiwavelet critical-sampling
transform (IDMWCST) and a DMWCST are used in the realization of the OFDM system instead of the
inverse fast Fourier transform (IFFT) and the FFT, respectively.

Real-time implementation of OFDM systems plays a major role in next generation communication
systems. The field programmable gate array (FPGA) offers a high-performance, low-cost platform for
implementing and verifying digital communication systems. Because the FPGA is highly flexible and
can be upgraded continuously, it has become widely relied upon to implement digital signal
processing (DSP) functions [21]-[24]. Veena and Swamy [25] presented the FPGA implementation of the
discrete multiwavelet over-sampling transform (DMWOST) for OFDM using the Xilinx integrated
synthesis environment (Xilinx ISE) targeting a Virtex-5 FPGA. It was found that the DMWOST required
2695 slices and 51 mW power dissipation. On the other hand, the DMWOST gives poor bit error rate
performance compared with the DMWCST [20] presented in this work. A modified DMWOST architecture
was offered in [26]; it was implemented on a Virtex-II FPGA and operates at a frequency of 300 MHz.
The architecture occupied an area of less than 1%, with a consumed power of 28 mW.

The aim of this paper is to present the design and implementation of the DMWCST-OFDM system in
real time on an FPGA platform. The presented system has been built using a high-level design tool
that is part of Vivado® Design Suite, called Xilinx system generator (XSG). XSG comes with a
predefined Xilinx blockset for MATLAB Simulink, and can be used to implement algorithms and map them
directly onto the target FPGA hardware. MATLAB's Simulink enables the designer to use high-level
abstractions of the system that can be automatically compiled into the FPGA. It gives the designer a
thin boundary between hardware and software, since the blocks can be synthesized into very
high-speed integrated circuit hardware description language (VHDL) and compiled into the FPGA with a
single click.

The rest of the paper is organized in the following order. An overview of the DMWCST-OFDM 
system is detailed in section 2. Section 3 presents the implementation of the DMWCST-OFDM system on 
FPGA using XSG. Section 4 discusses the results, and the conclusions are presented in section 5. 


2. THEORETICAL BASIS

The block diagram of the DMWCST-OFDM system is illustrated in Figure 1. At the transmitter side,
serial binary data are generated randomly. The serial-to-parallel (S/P) converter converts the
serial bit stream into parallel lower-rate sub-streams, which are then formatted into the symbols
required for transmission. After that, each symbol is mapped according to the quadrature phase shift
keying (QPSK) mapping technique, resulting in the parallel data sub-channels. Then, several null
subcarriers are added to the resulting data sequences. An N-point IDMWCST is applied to the signal
in order to achieve orthogonality between the subcarriers. Finally, the data are sent to the
receiver after being converted to a frame structure. The receiver performs the inverse operations of
the transmitter in reverse order to yield the correct data stream [20].


[Figure 1 diagram: transmitter chain starting from the Tx bits and receiver chain ending at the Rx bits]


Figure 1. Block diagram of the DMWCST-OFDM 


3. METHOD 

The DMWCST-OFDM system has been implemented by utilizing the parallelization capabilities of the
FPGA in order to decrease the latency with which the design produces its final output. Moreover, the
pipeline options available in the Xilinx blocks are exploited. The tools employed in this work are
MATLAB R2013b (64-bit) along with Simulink 8.2 and Vivado® Design Suite version 2015.2. The number
of used subcarriers (N) in this implementation is 16, while the number of useful subcarriers (No) is
12. The implementation of the DMWCST-OFDM system consists of two parts: the transmitter and the
receiver. The details of each part are presented in the following sub-sections.


3.1. Transmitter section 

The first block in the transmitter side is the S/P and mapping block; its XSG implementation is
illustrated in Figure 2. The S/P block takes the serial unsigned data, represented as one-bit fixed
point with zero binary point, and produces a single 2-bit output from every two consecutive input
bits, to be mapped to the corresponding QPSK symbol. The serial input data are arranged most
significant word first. Note that the clock frequency is halved after the S/P block. The two bits
are then separated by two slice blocks into the select signals of two multiplexers, which decide the
I (in-phase) and Q (quadrature-phase) values of the desired complex symbol.


Field programmable gate array implementation of multiwavelet transform ... (Suha Qasim Hadi) 


51383 O ISSN: 2088-8708 


[Figure 2 diagram: the input data feed the S/P block; Slice 1 (MSB) drives the select of Mux 1
(output I) and Slice 2 (LSB) drives the select of Mux 2 (output Q); each multiplexer chooses between
the constants -0.70703125 and 0.70703125]


Figure 2. XSG model of the S/P and mapping 
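
The S/P and mapping step can be sketched in a few lines of Python. This is only an illustrative software model, not the XSG design: the function names are hypothetical, and the choice of which bit value selects the negative constant is an assumption, since the text does not state it. The constant 0.70703125 is taken from Figure 2 and equals 181/256, that is, 1/sqrt(2) rounded to 8 fractional bits.

```python
# 1/sqrt(2) rounded to 8 fractional bits: 181/256 = 0.70703125,
# the constant magnitude that appears in Figure 2.
AMP = 181 / 256


def serial_to_pairs(bits):
    """Group a serial bit stream into (MSB, LSB) pairs, first bit first.

    One pair is produced for every two input bits, which is why the
    sample rate after this stage is half the input bit rate.
    """
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    return [(bits[i], bits[i + 1]) for i in range(0, len(bits), 2)]


def qpsk_map(bits, one_is_negative=True):
    """Map bit pairs to (I, Q): the MSB selects I and the LSB selects Q.

    ASSUMPTION: which bit value selects the negative constant is not
    given in the text; it is exposed here as a hypothetical flag.
    """
    def level(bit):
        negative = (bit == 1) if one_is_negative else (bit == 0)
        return -AMP if negative else AMP

    return [(level(msb), level(lsb)) for msb, lsb in serial_to_pairs(bits)]


if __name__ == "__main__":
    for symbol in qpsk_map([0, 0, 0, 1, 1, 0, 1, 1]):
        print(symbol)
```

Eight input bits yield four complex symbols, which mirrors the halving of the clock frequency noted above.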


Then a set of null subcarriers is inserted into each OFDM symbol. The Xilinx implementation of this
block is depicted in Figure 3. Two zeros are added at the beginning and two at the end of each OFDM
symbol. An unsigned fixed-point representation is used for the zero value at the output of the
constant block. A set of multiplexers is used to allocate the OFDM symbol and control the flow of
the input to the next subsystem; they are controlled by the M-code block and a counter. The delay
blocks at the inputs of the multiplexers make every I or Q sample of one OFDM symbol enter its
intended multiplexer simultaneously, at the time of the last sample of that OFDM symbol. The zero
constant block is also used to reset the multiplexers between OFDM symbols.

Figure 3. XSG model of the zero padding
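
As a plain software illustration of the null-subcarrier insertion (and of its removal at the receiver), the following Python sketch pads 12 useful samples to a 16-sample symbol with two zeros at each end. The sizes come from the text; the function and constant names are hypothetical, and the multiplexer, delay, and counter timing of the XSG model is not reproduced.

```python
N_TOTAL = 16                       # transform size used in this design
N_USEFUL = 12                      # useful (data-carrying) subcarriers
PAD = (N_TOTAL - N_USEFUL) // 2    # two nulls at each end of the symbol


def insert_nulls(useful):
    """Return one OFDM symbol: PAD zeros, the useful samples, PAD zeros."""
    if len(useful) != N_USEFUL:
        raise ValueError(f"expected {N_USEFUL} samples, got {len(useful)}")
    return [0.0] * PAD + list(useful) + [0.0] * PAD


def remove_nulls(symbol):
    """Drop the null subcarriers again (receiver side)."""
    if len(symbol) != N_TOTAL:
        raise ValueError(f"expected {N_TOTAL} samples, got {len(symbol)}")
    return list(symbol[PAD:PAD + N_USEFUL])


if __name__ == "__main__":
    data = [float(k + 1) for k in range(N_USEFUL)]
    padded = insert_nulls(data)
    print(padded)
    print(remove_nulls(padded) == data)
```

The same routine would be applied to the I samples and to the Q samples of each symbol.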


The XSG implementation of the IDMWCST, which is achieved by matrix multiplication as given in [20],
consists of many blocks: one performs all the multiplications required by the matrix multiplication,
while the others perform the additions of the resulting products. Each element of the output is the
result of one addition block. To reduce the resources required by the design, all multiplications
and additions involving zero coefficients are omitted. The XSG implementation of the addition blocks
is presented in Figure 4.
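
The saving obtained by omitting zero coefficients can be illustrated with a short Python sketch. The actual transformation matrix is given in [20] and is not reproduced here, so the code uses a hypothetical stand-in: a sparse orthonormal matrix built from 2x2 Haar blocks. Only the zero-skipping idea is being shown; the operation counts printed below apply to the stand-in, not to the DMWCST matrix.

```python
import math


def haar_block_matrix(n):
    """Hypothetical sparse orthonormal stand-in, NOT the matrix from [20]."""
    s = 1 / math.sqrt(2)
    m = [[0.0] * n for _ in range(n)]
    for k in range(0, n, 2):
        m[k][k], m[k][k + 1] = s, s
        m[k + 1][k], m[k + 1][k + 1] = s, -s
    return m


def build_terms(matrix):
    """For each output row keep only the (column, coefficient) pairs != 0."""
    return [[(j, c) for j, c in enumerate(row) if c != 0.0] for row in matrix]


def sparse_matvec(terms, x):
    """Matrix-vector product using only the retained products and sums."""
    return [sum(c * x[j] for j, c in row) for row in terms]


def dense_matvec(matrix, x):
    """Reference product that also multiplies by the zero coefficients."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in matrix]


if __name__ == "__main__":
    n = 16
    matrix = haar_block_matrix(n)
    terms = build_terms(matrix)
    x = [float(k) for k in range(n)]
    print("multiplications kept:", sum(len(row) for row in terms))
    print("multiplications in a dense product:", n * n)
    print("same result:", all(
        math.isclose(a, b, abs_tol=1e-12)
        for a, b in zip(sparse_matvec(terms, x), dense_matvec(matrix, x))))
```

In hardware, each retained pair corresponds to one multiplier input and one adder input, so a sparser matrix translates directly into fewer blocks.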

Finally, Figure 5 shows the implementation architecture of the P/S converter, in which the incoming
parallel I and Q data are converted to one serial sequence by the time division multiplexer. Two
multiplexers are then used to separate the I and Q data for simultaneous transmission.

Figure 4. XSG implementation of addition blocks

Figure 5. XSG model of P/S converter


3.2. Receiver section 

The first step in the receiver side is to convert the I and Q components from serial to parallel
with the S/P converter. The Xilinx implementation of this step is depicted in Figure 6. A
multiplexer controls the input flow: during the first 16 clocks, which equal the period of one OFDM
symbol, the I data are passed to the time division demultiplexer, and during the second 16 clocks
the Q data are passed. The time division demultiplexer converts the I and Q components from serial
to parallel.
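
A rough behavioural model of this step in Python is given below. It assumes that the 16 I samples and the 16 Q samples of one symbol are both available to the multiplexer when their turn comes (how the Q samples are held until the second 16 clocks is not described in the text), and all names are hypothetical.

```python
def time_division_demux(serial, n=16):
    """Collect every n serial samples into one parallel word."""
    words, register = [], []
    for sample in serial:
        register.append(sample)
        if len(register) == n:
            words.append(register)
            register = []
    return words


def receive_symbol(i_samples, q_samples, n=16):
    """Model the input multiplexer followed by the demultiplexer.

    For clocks 0..n-1 the multiplexer passes the I samples, for clocks
    n..2n-1 it passes the Q samples (ASSUMPTION: Q is buffered until then).
    """
    mux_out = []
    for clock in range(2 * n):
        source = i_samples if clock < n else q_samples
        mux_out.append(source[clock % n])
    i_word, q_word = time_division_demux(mux_out, n)
    return i_word, q_word


if __name__ == "__main__":
    i_in = list(range(16))
    q_in = list(range(100, 116))
    i_out, q_out = receive_symbol(i_in, q_in)
    print(i_out)
    print(q_out)
```

The two parallel words are produced one after the other, which is why a later block has to realign I and Q before demapping.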

The second step is performing the DMWCST. The XSG implementation of this block is likewise achieved
by matrix multiplication between the transformation matrix given in [20] and the input vectors of I
and Q. As in the transmitter side, the zero coefficients are not taken into consideration.

After that, the remove zero padding block is implemented as shown in Figure 7. This block serves two
purposes. The first is to remove the null subcarriers from each OFDM symbol, and the second is to
make the I and Q components pass to the next block simultaneously, because prior to this subsystem
they were passing sequentially. The terminator blocks are used to remove the null subcarriers that
were added at the beginning and end of the OFDM symbol at the transmitter side. Then two
multiplexers with delay blocks, controlled by the counter and the M-code block, separate the I and Q
components and make them pass to the next subsystem simultaneously.

Figure 6. XSG model of the S/P converter

Figure 7. XSG model of the remove zero padding

Finally, the hard decision method is used to perform the QPSK demapping. First, the I and Q values
are compared with a threshold of 0. Then, the decisions are grouped by the concat block to form a
2-bit QPSK symbol. Next, the QPSK symbol is converted to serial binary data by a P/S conversion
block. The XSG implementation of this step is presented in Figure 8.

Figure 8. XSG model of the de-mapping and P/S converter
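
The hard decision demapping can be sketched in Python as follows. This is a minimal software model with hypothetical names; the polarity (whether a negative value decodes to bit 1 or bit 0) is an assumption, because the text only states that the threshold is 0.

```python
def hard_decision(value, one_is_negative=True):
    """Compare one I or Q value with the threshold 0 and return a bit.

    ASSUMPTION: by default a negative value decodes to bit 1.
    """
    below = value < 0
    return int(below) if one_is_negative else int(not below)


def qpsk_demap(symbols, one_is_negative=True):
    """Turn (I, Q) pairs back into a serial bit list, MSB first."""
    bits = []
    for i_val, q_val in symbols:
        msb = hard_decision(i_val, one_is_negative)
        lsb = hard_decision(q_val, one_is_negative)
        word = (msb << 1) | lsb                      # concat: 2-bit symbol
        bits.extend([(word >> 1) & 1, word & 1])     # P/S: MSB leaves first
    return bits


if __name__ == "__main__":
    noisy = [(0.6, -0.8), (-0.1, 0.2), (0.71, 0.69), (-0.7, -0.75)]
    print(qpsk_demap(noisy))
```

Because only the sign of each component is used, the decision is unaffected by amplitude errors that do not push a sample across zero.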


4. RESULT AND DISCUSSION 

Figure 9 shows the hardware co-simulation model of the OFDM-DMWCST system and its programming onto
the Zynq (XC7Z020-1CLG484) FPGA board over a joint test action group (JTAG) connection. The VHDL
codes are generated after the co-simulation. The figure shows that the output of the hardware
co-simulation model matches the input data, so the programming of the FPGA board has been
successful.

The VHDL coding of the system is accomplished through the hardware co-simulation process. The
generated VHDL codes are imported into the Vivado tool for analysis at the register transfer level
(RTL), where synthesis and implementation are performed. These operations produce a large number of
reports, which describe aspects of the implementation such as the storage resources, I/O resources,
and power required.

[Figure 9 panel label: Output from Simulation Model]

Figure 9. Hardware co-simulation for the OFDM-DMWCST


Tables 1 and 2 summarize the area results, in terms of resource consumption, for the transmitter and
receiver of the Xilinx OFDM-DMWCST, respectively. These tables show that the design is efficient in
terms of resource utilization. Finally, the summary results of the proposed OFDM-DMWCST model are
compared with the reference designs [25], [26]. The comparison results are reported in Table 3.
Table 3 indicates that our Xilinx design consumes little power compared with the total power of the
board: the transmitter consumes 12 mW, which is half the power consumed by the design in
reference [25].


Table 1. Area of utilization report for the transmitter 


Resource      Used    Available   Utilization (%)
FF             276      106400         0.26
LUT           3441       53200         6.47
Memory LUT      86       17400         0.49
I/O             26         200        13.00
BUFG             1          32         3.12


Table 2. Area of utilization report for the receiver 


Resource      Used    Available   Utilization (%)
FF             700      106400         0.66
LUT           3363       53200         6.32
Memory LUT     116       17400         0.67
I/O             26         200        13.00
BUFG             1          32         3.12
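
The utilization percentages in Tables 1 and 2 follow directly from the used and available counts. As a quick arithmetic check, the short Python sketch below recomputes them; the counts are copied from the two tables, while the dictionary and function names are only illustrative.

```python
# Available resources on the target device, as listed in Tables 1 and 2.
AVAILABLE = {"FF": 106400, "LUT": 53200, "Memory LUT": 17400, "I/O": 200, "BUFG": 32}

# Used resources reported for the transmitter (Table 1) and receiver (Table 2).
TRANSMITTER = {"FF": 276, "LUT": 3441, "Memory LUT": 86, "I/O": 26, "BUFG": 1}
RECEIVER = {"FF": 700, "LUT": 3363, "Memory LUT": 116, "I/O": 26, "BUFG": 1}


def utilization_percent(used, available=AVAILABLE):
    """Return 100 * used / available for every resource type."""
    return {name: 100.0 * count / available[name] for name, count in used.items()}


if __name__ == "__main__":
    for label, used in (("transmitter", TRANSMITTER), ("receiver", RECEIVER)):
        for name, pct in utilization_percent(used).items():
            print(f"{label:11s} {name:10s} {pct:7.3f} %")
```

For example, the transmitter LUT figure is 100 x 3441 / 53200, which is approximately 6.47%.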


From the above results, it is clear that the design consumes few resources (less than 1% of the
flip-flops and memory LUTs, about 6.5% of the LUTs, and 13% of the I/O resources) and low power.
Designing a parallel and pipelined architecture for the proposed model gave better latency and
throughput compared with [25], [26]. So, the proposed design is suitable for real-time and
low-power applications.


Table 3. Summary results reports compared with references [25], [26] 


Parameter            Reference [25]   Reference [26]   Proposed design
FF                        1996              72               700
Memory LUT                 ---             126               116
I/O                        ---              64                26
BUFG                       ---               1                 1
Power dissipation         24 mW            ---              12 mW


5. CONCLUSION 

A baseband OFDM-DMWCST system was designed and successfully implemented on the Zynq
(XC7Z020-1CLG484) FPGA board using the XSG tool, which is part of the Vivado® Design Suite software.
XSG combined with MATLAB/Simulink provided an easy and efficient way of implementing the
OFDM-DMWCST on the FPGA. Also, the hardware co-simulation feature of XSG made it easier to test and
debug the system effectively in real time. The hardware simulation and synthesis results showed that
the implemented system works correctly, is efficient in terms of resource utilization, and supports
real-time operation.

The referenced literature presented FPGA implementations of the DMWOST based OFDM that focused on
the transform only and, unlike our work, did not implement the OFDM as a complete system. In our
work, a new implementation of the DMWCST based OFDM is designed and synthesized. So, the results
obtained can be considered a reference design for the DMWCST based OFDM architecture on FPGA.